arXiv:1507.06015v2 [q-fin.RM] 31Jul2016 


Risk Quantification in Stochastic Simulation under 

Input Uncertainty 

Helin Zhu and Enlu Zhou 

H. Milton Stewart School of Industrial and Systems Engineering, 
Georgia Institute of Technology 


Abstract 

When simulating a complex stochastic system, the behavior of output response de¬ 
pends on input parameters estimated from finite real-world data, and the finiteness 
of data brings input uncertainty into the system. The quantification of the impact 
of input uncertainty on output response has been extensively studied. Most of the 
existing literature focuses on providing inferences on the mean response at the true 
but unknown input parameter, including point estimation and confidence interval con¬ 
struction. Risk quantification of mean response under input uncertainty often plays 
an important role in system evaluation and control, because it provides inferences on 
extreme scenarios of mean response in all possible input models. To the best of our 
knowledge, it has rarely been systematically studied in the literature. In this paper, 
first we introduce risk measures of mean response under input uncertainty, and pro¬ 
pose a nested Monte Carlo simulation approach to estimate them. Then we develop 
asymptotical properties such as consistency and asymptotic normality for the proposed 
nested risk estimators. Finally we study the associated budget allocation problem for 
efficient nested risk simulation. 

Key words: Input uncertainty, risk quantification, Monte Carlo simulation, nested 
risk estimators, budget allocation. 


1 Introduction and Motivation 

For a complex real-world stochastic system, simulation is a powerful tool to analyze its be¬ 
havior when real experiments on the system are expensive or difficult to conduct. Simulation 
is driven by input models that are distributions capturing the randomness in the system. 
For example, when simulating a queueing network, the random customer arrival and service 
times are generated from appropriate distributions (i.e., input models). The uncertainty 
on input parameters (e.g., customer arrival rates and service rates) may need to be taken 
into account, since they are typically estimated from hnite records of historical data. In 
general, there are two sources of uncertainty in a typical stochastic simulation experiment: 
the extrinsic uncertainty on input parameters (referred to as input parameter uneertainty, 
or simply input uneertainty) that reflects the variability of the hnite data used to estimate 
input parameters, and the intrinsic uncertainty on output response (referred to as stoehastie 
uneertainty) that rehects the inherent stochasticity of the system. 
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The variability of simulation output response clearly depends on both input uncertainty 
and stochastic uncertainty. An important question to address is how to quantify the im¬ 
pact of input uncertainty on output response variability in the presence of stochastic un¬ 
certainty. Various quantihcation methods have been proposed, including frequentist and 
Bayesian methods among many others. Frequentist methods include the Direct/Bootstrap 
Resampling methods by Barton and Schruben (1993, 2001), Cheng and Holloand (1997), etc. 
The input model for these methods can be a non-parametric empirical distribution or a para¬ 
metric distribution estimated from historical data. Bayesian methods include the Bayesian 


Model Averaging (BMA) methods by Chick (2001), Zouaoui and Wilson (2003, 2004), Biller 


and Corlu (2011), etc. In these methods, a Bayesian updating rule is applied on a chosen 


prior distribution of input parameter to generate a posterior parameter distribution, which 
is then used as the sampling distribution of input parameter in the simulation experiment. 
In addition to these methods, Cheng and Holloand (1997) also develops the 5-method, which 
decomposes the variance of output response into two components that are caused by input 
uncertainty and stochastic uncertainty, respectively. Song and Nelson (2015) develops a 


method for quickly assessing the relative contribution of each input distribution to the over¬ 
all variance. In recent years, with the rise of stochastic kriging in stochastic simulation (e.g.. 


Ankenman et al. (2010)), meta-model assisted methods have been developed for quantifying 


input uncertainty, see Barton et al. (2013), Xie et al. (2014, 2015), etc. Henderson (2003) 
provides an early review on the importance of input uncertainty and common methods to 
deal with it. Barton (2012) provides a more recent review on popular methods in output 
analysis under input uncertainty, and highlights some remaining challenges in this area. 

Some of the aforementioned works aim at providing inferences on the mean response at 
the true but unknown input parameter, often through point estimation and conhdence in¬ 
terval (Cl) construction. Some others focus on obtaining an empirical distribution of mean 
response, and providing a more complete picture of all possible scenarios of mean response 
under input uncertainty. However, to the best of our knowledge, rigorous quantihcation of 
extreme scenarios of mean response in all possible input models is still lacking. Such quan¬ 
tihcation could provide inferences on system sensitivity or robustness to input uncertainty, 
and thus would be critical for control of the system. 

For example, consider the system of a typical hospital emergency room (ER). When 
the administrators of ER determine the number of on-call doctors, one of the main system 
responses that needs to be monitored is the average patient waiting time. It is critical to 
assess and control the risk of extreme mean response scenarios, i.e., the risk of large average 
patient waiting times, in all possible input models, since this might lead to delayed treatment 
of patients and serious consequences in life-threatening situations. 

For another example, consider a large-scale power system. It is usually too expensive 
or risky to conduct real experiments on the system operation, and therefore, stochastic 
simulation is often used to study the economics, reliability, and emission variable ehects of 
power systems operating in a market environment (Degeilh and Gross (2015)). In a typical 
power system simulation experiment, the inputs may include the resource parameters, the 
loading (market demand) parameters, etc., which all exhibit variability and uncertainty. 
The risk quantihcation and management of system performance under input uncertainty is 
of great importance because extreme scenarios of mean response (e.g., high mean power 
loads during peak time) might cause a part or whole breakdown of the power system and 
lead to disastrous outcomes. 

In this paper, we aim to quantify the risk in stochastic simulation under input uncertainty. 
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by studying risk measures of mean response w.r.t. the distribution of input parameter. 
We will focus on risk measures such as Value-at-Risk (VaR) and Conditional Value-at-Risk 
(CVaR). Loosely speaking, VaR characterizes an extreme (e.g., 99%) quantile of the mean 
response distribution, and CVaR characterizes the conditional expectation of a very tail 
portion of the mean response distribution. VaR, as one of the very earliest risk measures 
introduced in hnancial risk management, is easy to understand and interpret for practitioners. 
CVaR, as a classic coherent risk measure (see, e.g., Artzner et al. (|1999)), exhibits nice 


properties such as convexity and monotonicity for optimization (see, e.g., Rockafellar and 


Uryasev (2000)). They have been extensively used in hnancial industry, especially after the 
hnancial crisis in 2008. An abundant literature has dedicated to studying the estimation 
and optimization of risk measures under various settings; in particular, Hong et al. (2014) 
provides a comprehensive review of Monte Carlo methods for VaR and CVaR. 

We will introduce VaR and CVaR for quantifying the risk in stochastic simulation under 
input uncertainty, and provide numerical schemes for their estimation. Specihcally, we will 
study nested Monte Carlo estimators for VaR and CVaR of mean response from both the¬ 
oretical and computational perspectives. Our numerical examples illustrate the importance 
and necessity of risk quantihcation under input uncertainty. To summarize, the contributions 
of this paper are three-folds: 


(1) For output analysis in stochastic simulation, our work is among the hrst to systemati¬ 
cally study risk quantihcation of mean response in all possible input models using risk 
measures. 

(2) Under the respective “Weak Assumption” and “Strong Assumption” (elaborated in 
Section]^, we show that the proposed nested risk estimators are consistent and asymp¬ 
totically normally distributed in diherent limiting senses, which are the guarantees for 
constructing asymptotically valid CIs. 

(3) We solve the associated budget allocation problem that arises in nested simulation 
of risk estimators, in order to improve simulation efficiency. The numerical study 
demonstrates the ehectiveness of our approach and shows that the obtained budget 
allocation schemes drastically reduce the widths of the CIs constructed. 


We note that, in a broader sense, our framework bears some similarity with risk assess¬ 
ment in credit management, since both of them deal with simulating certain conditional 
expectations. The work most relevant to ours is probably Gordy and Junejal (2010), 


m 


which the authors study the asymptotic representation of the Mean Squared Error (MSE) of 
nested risk estimators in credit risk management. By minimizing MSE asymptotically, they 
obtain an (asymptotically) optimal budget allocation scheme. In contrast, our work focus 
on the analysis of asymptotical properties such as consistency and asymptotic normality of 
the proposed nested risk estimators. Furthermore, the associated budget allocation prob¬ 
lem in our approach is to minimize the widths of the CIs constructed, and as a result our 
solution strategy and optimal budget allocation schemes are drastically different from the 


ones in 

Gordy and Juneja 

(2010 

). We acknowledge that part of our analysis follows from 

the assumptions and analysis in 

Gordy and Juneja 

(2010 

)• 


Other common approaches for credit risk management include but not limited to the 


delta-gamma method by 

Rouvinez 

(1997) 

Glasserman et al. 

2000 

), etc; the two-level con- 

hdence interval procedure with screening by 

Lan et al. (2011 

)), etc; the stochastic kriging 
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method by Liu and Staum (2010), etc; the ranking and selection method by Broadie et ah 


(2011), etc. Among other relevant literature, Lee (1998) studies point estimation of a quan¬ 


tile (VaR) of the distribution of a conditional expectation via a two-level simulation; Steckley 


(2006) considers estimating the density of a conditional expectation using kernel density es¬ 
timation; Sun et ah (2011) studies efficient nested simulation for estimating the variance of a 
conditional expectation. Most of these works focus on efficient allocation of inner simulation 
sizes across different outer scenarios, and Lee (1998), Steckley (2006), and Sun et ah (2011) 
consider optimal allocation between inner and outer sampling. Our work distinguishes from 
these works in that we focus on the theoretical properties of nested risk estimators, and our 
budget allocation scheme can be viewed as a byproduct of the theoretical properties estab¬ 
lished. We do point out that varying inner-layer sample sizes across different outer-layer 
scenarios, as studied in some of the aforementioned works, could be further incorporated 
here to improve simulation efficiency; however, it is beyond the scope of this paper. 

The rest of the paper is organized as follows. In Section we introduce risk measures 
VaR and CVaR of mean response w.r.t. input uncertainty, and propose nested risk estima¬ 
tors for risk quantihcation in stochastic simulation under input uncertainty. In Section 
we establish the asymptotical properties of the proposed nested risk estimators, and then 
construct asymptotically valid CIs. We formulate the associated budget allocation problem 
and propose a new approach to solve it in Section In Section we conduct numerical ex¬ 
periments to demonstrate some of the theoretical results from previous sections. Conclusions 
are provided in Section 


2 Risk Measures of Mean Response under Input Un¬ 
certainty 

2.1 Formulation 

Let us hrst dehne risk measures VaR and CVaR of mean response rigorously under input 
uncertainty. 

In a stochastic simulation experiment, consider a response function in the form of h{6]^), 
where 6 represents the input parameter(s) and ^ represents the noise (stochastic uncer¬ 
tainty) in the response. Let H{6) = E^[h(0;^)] be the mean response, and thus h{6-,^) = 
H{6) + where £{0]^) is the stochastic noise that satishes E[£^(6*;^)|6'] = 0 and 

Var[£{d]^)\d] = Tq. Here assume Tq is a hnite deterministic function of 9. Furthermore, 
suppose there is a probability distribution (called “belief distribution”) on 6 that reflects 
our belief on input uncertainty, since 9 needs to be inferred from hnite historical data. For 
example, if one takes a Bayesian approach, then the belief distribution is constructed via 
Bayesian updating. Of course, there are other approaches such as bootstrapping. Specih- 
cally, suppose Po{9) is a prior distribution on 9, and it could be either non-informative or 
informative depending on prior knowledge. Then the posterior distribution p(-|x) is obtained 
via sequential Bayesian updating with historical data x. Assume := f Tgp(9jx)d9 is also 
hnite. 

Let 0 < a < 1 be the risk level of interest (e.g., a = 0.99). Then VaR of the mean 
response ^(9), denoted by Va (E^[h(6*; ^)]) (or interchangeably Va (11(9))), is dehned by the 
a-quantile of ff(9), i.e., 

Va (H(9)) = inf{f : F(t) > a}, (2.1) 


4 

























where F{-) is the cumulative distribution function (c.d.f.) of H{6). When H{6) admits 
a positive and continuous probability density function (p.d.f.), which is denoted by /(•), 
around Va {H{9)), (2.1) can be simplihed as Va {H{6)) = F~^{a). CVaR of H{9), denoted 


by Cq (E^[h(6';,^)]) (or interchangeably Ca {H{9))), is dehned by the conditional expectation 
of the a-tail distribution of H{9), i.e., 


(H(e)) = vJH(e)) + - t>„(i/(»)))+] . 


( 2 . 2 ) 


With slight abuse of notations, we use Va and Cq, as the abbreviations for Va {H{9)) and 
Ca {H{9)), respectively. 

Calculating risk measures such as Va and Ca is straightforward when the system is simple. 
For example, when the p.d.f. of H{9) admits an explicit expression, VaR or CVaR of H{9) 
could be calculated via numerical integration. 


2.2 Nested Simulation of VaR and CVaR 

Let us hrst consider Monte Carlo estimation of Va and Cq, without the presence of stochastic 
uncertainty. That is, H{9) can be evaluated exactly for all 9. 

First, draw N i.i.d. scenarios 9i, ...,9 n from the belief distribution p(6*|x); then, simulate 
{H{9i) : i = l,...,iV} and sort them in ascending order, denoted by H{9{i)) < H{9(^2)) < 
■ ■ ■ < Ff(6*(Ar)); finally, estimators of Va and Ca are given, respectively, by 


H{9^aN))i 




N 


(l-a)JV^ 


^ moi) - , 


where for convenience we assume aN is an integer. Intuitively, Va is the a-level VaR of 
the empirical mean response distribution consisting of {H{9(^i)) : i = l,...,iV}. In parallel, 
Cq, is the a-level CVaR of the empirical mean response distribution. The properties of 
Va and Cq have been well-studied in the literature. For example, although Fq and Cq are 
not unbiased, they are strongly consistent and asymptotically normally distributed under 


appropriate regularity conditions (Sun and Hong (2010)). 

When stochastic uncertainty is present, the exact value of H{9) might not be readily 
available; instead, it is estimated via sample averaging. Naturally, to obtain estimators of Va 
and Cq, we can extend the estimation procedure described above by replacing {H{9i)} with 
their sample average estimates {H{9i)}. Specihcally, for each input scenario 9i, simulate 
M i.i.d. response samples {h{9i]^ij) : j = then, approximate H{9i) by HM{9i) = 

'h iij) and sort them in ascending order, denoted by Hm{9^^'>) < Hm{9‘^‘^^) < ■ ■ ■ < 

hnally, estimate Va and Cq, respectively, by 


Fq = 

^ ^ 1 
Cq — Va “1“ 


N 


(l-a)JV^ 




(2.3) 

(2.4) 


We refer to Va or Cq as “nested risk estimator”, since nested simulation is incurred in the 
estimation. With the implication of stochastic uncertainty, the asymptotical properties of 
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Va and Cq, become more complicated. In next section, we will show that and maintain 
to be strongly consistent and asymptotically normally distribnted in different limiting senses 
nnder different sets of regnlarity conditions. Hence, nsing them as inferences for Va and Cq, 
respectively, is still reasonable. 

Note that the ordered statistics and ( 6 ^( 1 ), ..., 6 '(^r)) are different. In fact, for 

fixed inpnt scenarios 9i,...,9n, ( 0 ( 1 ),..., 6 '(Ar)) is a constant vector, while {9^^\ ...,9^^'>) is a 
random permutation of ( 0 (i), ..., 0 (iv)) that depends on the realizations of {h{9i;^ij)}. 


Remark 2.1. In Barton et al.\ 12013 ) and Xie et al. (2014), the authors use nested VaR 
estimator Vp /2 andvi_p /2 as the lower-upper boundaries of a credible interval (CrI) for H{9c) 
with confidence level (1—p), where H{9c) is the mean response at the true but unknown input 
parameter Oc- The purpose is to cover the structural bias (Ep(.|x)[hf(0)] — H{9c)) from using 
^Yl!i=iHM{9i) as an estimator of H{9c)- 


3 Asymptotic Analysis of Nested VaR and CVaR Es¬ 
timators 


In this section, we analyze the asymptotical properties of nested risk estimators Va and 
Cq,, as the inner and outer sample sizes both go to infinity. In particular, we will prove 
their strong consistency and asymptotic normality in different limiting senses under different 
sets of regularity assumptions, which are referred to as “Weak Assumption” and “Strong 
Assumption”, respectively. 


Assumption 3.1. Weak Assumption. 

(i) The response h{9]f() has finite conditional second moment, i.e., Tq = E[h^(0;^)|0] < oo 
w.p.l and = J Tgp{9\x.)d9 < oo. 

(a) The p.d.f. /(■) of the mean response H{9) is positive and continuous, and continuously 
differentiable around Va- 


Assumption |3.1| is weak in the sense that it imposes separate assumptions on input 
uncertainty and stochastic uncertainty. In contrast to the following Strong Assumption, it 
does not impose joint assumptions on input uncertainty and stochastic uncertainty. 

Notice that 


M 




M ^ ^ 




M 




i=i 


Let us define a normalized noise term Sm 

1 ^ 
J = 1 


By Central Limit Theorem, under appropriate assumptions Em has a limiting distribution 
as M —)■ oo. Note that Hm{9) = H{9) + Em/^/M, then the effect of the diminishing noise 
term Em/VM on the distribution of Hm{9) will vanish as M —)■ oo. Therefore, we expect 
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the “distance” between the distribution of HM{d) and the distribution of H{9) to vanish as 
M —)■ oo. That is, /m —t / as M —>■ oo, where /m(') represents the p.d.f. of Hm{6)- In 
particular, the following Strong Assumption guarantees that /m converges to / sufficiently 
fast. 

Assumption 3.2. Strong Assumption. 

(i) The response h{9;^) has finite conditional second moment, i.e., Tq = K[h‘^{9;^)\9] < oo 
w.p.l and = f Tgp(9lx)d9 < oo. 

fiij The joint density pM{h,e) of H{9) and Em, and its partial derivatives ■^pM{h,e) and 
■§^PM{h, e) exist for each M and for all pairs of (h, e). 

(Hi) There exist nonnegative functions 5'o,m(-); and 5 ' 2 ,m(-) such that puifi^e) < 

\'§hPM{h,e)\ < gi^M{e), ^gM{h,e) < g 2 ,M{e) for all {h,e). Furthermore, 
sup / \efgi,M{,a)de < oo for i = 0,1, 2, and 0 < r < 4. 

M 


Assumption |3.2 is strong in the sense that it imposes joint assumptions on input uncer¬ 
tainty and stochastic uncertainty. In particular. Assumption 3.2 (i) ensures that Em has a 


li miting distribution as M —>■ oo] Assumption 3.2 (ii) and 3.2 (iii) (similar to Assumption 1 


Gordy and Juneja (2010[)) ensure that the distance between /m(-) and /(■) is of the order 


of 

O 

and f have good structural properties (e.g., hnite moments up to some order). Note that 
when Strong Assumption holds. Weak Assumption naturally holds. 


A). Assumption 3.2 holds when h(-, •) is sufficiently smooth, and the distributions of 9 


3.1 Consistency 

It turns out, under Weak Assumption, nested risk estimators Va and Cq are consistent in the 
sense that they converge to Va and Cq, w.p.l, respectively, when M hrst goes to inhnity and 


then N goes to inhnity. In particular, we have the following Theorem 3.1 on the consistency 
of Va and Ca under Weak Assumption. 


Theorem 3.1. Consistency under Weak Assumption. Under Assumption 3.1, we 
have 

(3.1) 


lim lim Va = Va, w.p.l, and lim lim Cq, = Cq, w.p.l. 

N^oo M—^oo N^oo M^oo 


Proof. See Appendix 


□ 


Note that in Theorem |3.1 the limits on N and M are iterated and non-interchangeable. 
Intuitively, the inner sample size M going to inhnity ensures that, for any hxed 9, Hm{9) —)■ 

H{9) w.p.l (by Strong Law of Large Numbers). It follows that for hxed 9i, ..., 9^, the random 
order statistics ( 6 ^*^^^ ..., 9^^'^) —)■ ( 6 ^( 1 ), ...,9(^n)) w.p.l. as M ^ oo. Thus, {HM{9Alfi Hm{9^^'>)) —>■ 
{H{9(^ifi,..., H{9(^]\f'f)) w.p.l. as M —)■ cx). It follows that Va —t Va and Cq, —)■ Cq, w.p.l as 
M ^ oo. In view of the fact that Va —)■ Uq and Cq, —)■ w.p.l as A —)■ cx). Theorem 3.1 

holds. 


When Strong Assumption is imposed, we could strengthen the results in Theorem 3.1 In 


particular, the following Theorem 3^ shows that the iterated limits on N and M in Theorem 
13.11 could be relaxed into simultaneous limits. 
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Theorem 3.2. Consistency under Strong Assumption. Under Assumption 3.2. we 
have 

(3.2) 


lim Va = Va, w.n.l, and lim c„ = c^, w.n.l. 

N,M^oo N,M^oo 


Proof. See Appendix 


□ 


Theorem |3.2 implies Va and Cq converge to Va and Cq w.p.l, respectively, when N and 
M go to inhnity simultaneously. The intuition is as follows. For an arbitrary M, let us 
bound the difference between Va{HM{0)) (or Ca{HM{0))) and Va{H{6)) (or Ca{H{9))), where 
note that Va{HM{9)) is Va R of Hm{9) and Ca{HM{9)) is CVaR of Hm{9)- As mentioned 

ensures that the distance between /m(') and /(■) is of the order 


3.2 


previously, Assumption 
0(A). It follows that the difference between Va{HM{d)) and Va{H{6)) is also of the order 
O(^). Furthermore, note that Va could be regarded as an one-layer estimator of 


i.e.. 


Under Assumption 


3.2 


we could show that Va{HM{9)), i.e., Va{H{9)), converges to Va{HM{9)) 
M as N ^ oo. Therefore, Va{H{9)) converges to Va{H{9)) w.p.l as 


w.p.l uniformly for af 
N and M go to inhnity simultaneously. Hence, Theorem |3.2| holds 


3.2 Asymptotic Normality and Confidence Intervals 

After showing the consistency of Va and Cq,, it is natural to consider their asymptotic nor¬ 
mality properties and construct the associated CIs. 

Let us hrst investigate the asymptotic normality of Va and Cq, under Weak Assumption. 


Following the logics in Theorem 3.1, the total error of nested risk estimator Va or Ca is decom¬ 
posed into two components that are caused by input uncertainty and stochastic uncertainty, 
respectively. In particular. 


A 


Vo. = (!>a - Vo) + (Vo - Vo) = Evv, + Err2, 


(3.3) 


and 


Vo-Vo = (c„ - Co) + (?„ - Co) = Errs + Errs, 


(3.4) 


where Erri (or Errf) is caused by stochastic uncertainty, and Err 2 (or Err/f) is caused by 
input uncertainty. Furthermore, Erri (or Err^) and Err 2 (or Err/f) are correlated, and 
the correlation is difficult to characterize. Therefore, it is natural to establish asymptotic 
normality for each of the error terms independently, leading to a two-level procedure for 
constructing the associated CL That is, constructing a Cl for Erri (or Err^) and a Cl for 
Err 2 (or Err^) independently, and then integrating the two CIs into a wider Cl for Va (or 

Cq.) . 

The following Theorem |3.3 establishes the asymptotic normality for Va and Cq under 
Weak Assumption, and provides explicit characterizations of the asymptotic variances. 


Theorem 3.3. Normality under Weak Assumption. Under Assumption 3.1, we have 


lim y/N {va — Uq) ^ ct^A/'(0, 1) and lim VN (cq, — c^) ^ (TcA/'(0, 1), 


V 


N- 


N^oo 


(3.5) 


















(3.6) 


where := / f {va) and ac ■= \Jvar[{H{9) — Va)^]/{1 — a). Furthermore, 

lim lim a/M {va — Va) ^ TyAf{0, 1), 

N^OO M^OO 

lim lim ^^(1 — a)NM {va — Va) ^ rcA/'(0,1), 

N^OO M^OO 

where Ty := ■\JK[tq\H{6) = Vq] and Tc := ■\/'E[Tg\H{6) > Vq\ . 

Proof. See Appendix 

Let fd be the target error level (hence (1 — (5) is the target confidence level), we could 
construct CIs for Va and of confidence level (1 — fd) as follows. Following the error 
decompositions in ( |3.3 ) and (3.4), the error level fd is also decomposed into jdo and fdj 
(hence fd = fdo + fdi), which represent the error levels for the outer-layer simulation (input 
uncertainty) and the inner-layer simulation (stochastic uncertainty), respectively. 


□ 


Specifically, by (3.5) CIs for Erri and Err^ of confidence level (1 — fdo) are 


Va-Va G 


tpo/2,N-iav 

tl-Po/2,N-iav 


tpo/2,N-l<^c 

tl-Po/2,N-iac 

[ y/N 

y/N \ 


[ y/N 

y/N \ 


( 3 . 7 ) 


where ay is a sample estimate of ay = f{va), in which f{va) can be estimated 

using Gaussian kernel density estimation (Steckley (2006)); ay is a sample estimate of ay, 
and represents the y-quantile of a t-distribution with degree of freedom L. 

Similarly, by (3.6) CIs for Err 2 and Err^ of confidence level (1 — fdj) are 


Vyy -Vy, e 




y/M ’ y/M 

tl3i/2,{l-a)NM-lTc tl-jSi/2,{l-a)NM-lTy 

^y{l- a)NM ’ ^(1 - a)NM 

where % and % are sample estimates of Ty and Ty, respectively. 


Cyy G 


(3.8) 


Integrate the CIs in (3.7) and (3.8), a Cl for Va of confidence level (1 — fd) is 


~ ti3yyf2,N-lO'v tpi/2,M-lTv ~ ^l-/3o/2,iV-lC’'); tl-0i/2,M-^F 

^cx. ' /— ' / 5 ^cx. ' 


Vn 

and a Cl for Cq, of confidence level (1 — /3) is 


^/N 


y/M 


(3.9) 


~ I tl3ol2,N-l^c I ^/3//2,(l-o)AfM-l''‘c _ ^ ^l-/3o/2,Af-lCrc ^ ^l-/3j/2,(l-a)AfM-l''‘c 

Cq: + J= + / ^ ^ \ ^ -r , F ’ I 


y/N 


^(1 - a)NM 


yjN 


^{l-a)NM 


( 3 . 10 ) 


For simplicity, we refer to them as “CIs under Weak Assumption”. Note that we can control 
the simulation errors due to input uncertainty and stochastic uncertainty independently by 
choosing fdo and fdj as well as N and M appropriately. 

Under Strong Assumption, the asymptotic normality results for Va and are simpler in 


form. Following the logics in showing Theorem 3.2, the error of Va (or Cq) is decomposed 


into two components that account for the one-layer simulation error due to input uncertainty 
and the simulation bias due to stochastic uncertainty. By properly choosing N and M, we 
can make the bias component go to zero faster than the error component. Specifically, we 


have the following Theorem 3.4 on the asymptotic normality of Va and Cy 
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Theorem 3.4. Normality under Strong Assumption. Under Assumption 3.2, N = 
is a sufficient and necessary condition for 


lim \/]V (cq — Co) ^ crcA/'(0,1), (3.11) 

N.M^OO 


lim \fN iva — Va) ^ c^^,A/'(0,1) and 

N,M^oo 

where N = o{M‘^) means limM^-oo N/M"^ = 0. 

Proof. See Appendix □ 

Theorem |3.4 is consistent with the results in Gordy and Juneja (2010) on the charac¬ 


terizations of the asymptotic variances of Va and c^. We also note that Theorem 3.4 


IS 


stronger in that it directly leads to the results in Gordy and Juneja (2010). Moreover, 


by minimizing MSE, Gordy and Juneja (2010) shows that the variance and the bias of a 


nested risk estimator are balanced when the sample size pair {N, M) lives in the region of 
N = 0{M‘^). In contrast, Theorem 3.4 shows that a nested risk estimator is asymptotically 


normally distributed when {N,M) lives in the region of = o{M‘^). 

Following Theorem 3.4, we can construct GIs for Va and Cq, of conhdence level (1 — /9): 


Va + 


and 


Ca + 




tl3/2,N-lO'c 


Va + 


tl-i3/2,N-l<^v 

y/N 




Ca + 


tl-/3/2,N-lO'c 


(3.12) 


(3.13) 


Note that the GI in (3.12) or (3.13) only depends on N. It is because, when N = o{M‘^), 

B.3 in Appendix for 


the bias term due to stochastic uncertainty (see Lemma B.2 


or 


the explicit formula) is of the order O(^), and thus it will be asymptotically insignihcant 


compared with the error term. We refer to the GIs in (3.12) and (3.13) as “GIs under 

Strong Assumption”. 

Note that the GIs under Weak Assumption and Strong Assumption achieve different 
practical coverage probabilities when the target conhdence level is (1—/3). Due the relaxation 
in applying Boole’s Inequality for constructing GIs under Weak Assumption, the resulted 
GIs will achieve coverage probabilities greater than (1 — /?). In contrast, GIs under Strong 
Assumption will achieve a coverage probability of (1 —/?). The following Theorem 3.5 shows 


that GIs under both Weak Assumption and Strong Assumption are asymptotically valid. 
Theorem 3.5. Asymptotic Validity of CIs. 


(i) Under Assumption 3.1, the CIs defined in (3.9) and (3.10) are asymptotically valid. 


i.e., 


lim lim Pr{lb'f < Uq, < ub'f} >1-/5 and lim lim Pr{lbf < Ca < ubfj >1-/3, 

N—^'OO M^oo N^oo M^oo 


where Ibf and ubf denote the lower and upper boundaries of the Cl in (3.9), and Ib^ 


and ubf denote the lower and upper boundaries of the Cl in (3.10), respectively. 


'^c 
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(ii) Under Assumption 3.2, the CIs defined in (3.12) and (3.13) are asymptotically valid 
when N = o{M‘^), i.e., 

lim Prllhi < f« < uKA > 1 — fi and lim < ubA >1-/3, 

N,M^oo ^ ^ N,M^oo ^ ^ ^ 


where Ib^ and ub^ denote the lower and upper boundaries of the Cl in (3.12), and lb), 
and ub) denote the lower and upper boundaries of the Cl in (3.13), respectively. 

Proof. See Appendix □ 


4 Budget Allocation 


In practical simulation, usually there is a simulation budget that affects the choices of N 
and M. Intuitively, the outer sample size N determines the simulation error due to input 
uncertainty, while the inner sample size M determines the simulation error due to stochastic 
uncertainty. Therefore, choosing N and M appropriately is critical to balance the trade¬ 
off between capturing input uncertainty and capturing stochastic uncertainty, and improve 
overall efficiency. 

As shown in previous section, under Strong Assumption, the error of nested risk estimator 
Va (or Cq) could be decomposed into an error component caused by input uncertainty and 
a bias component caused by stochastic uncertainty. Within this framework, Gordy and 


Juneja (2010) proposes to minimize the asymptotic MSE, i.e., the summation of variance and 
squared bias, oiva- The result is an (asymptotically) optimal budget allocation scheme, N = 
that balances between the outer-layer sampling error and the inner-layer sampling 

bias. 

An alternative approach to improving simulation efficiency is to consider the optimal 
budget allocation problem under Weak Assumption. Note that, under weak assumption, the 
total error oiva (or Cq) is decomposed into two components that are asymptotically normally 
distributed and correspond to the simulation errors due to input uncertainty and stochastic 
uncertainty, respectively. Thus, an optimal budget allocation scheme can be determined by 
minimizing the width of the Cl in (3.9) or (3.10). This approach is a complement of the 
existing methods within the framework of efficient nested risk estimation. 

In particular, the Cl width minimization problem could be formulated as follows. Let 
Wy{N,M) and WfiN, W) be the half widths of the CIs in (3.9) and (3.10), respectively, i.e.. 


Wy{N,M) = 


tl-l3o/'2,N-lO'v 

Vn 


+ 


^/M 


(4.1) 


and 


WfiN, M) = 


Vn 


+ 


tl-/3j/2,{l-a)NM-lC 

V(1 - a)NM 


(4.2) 


They are the objective functions in the budget allocation problem. Note that there are 
four variables, i.e., /3o, fii, N, and M, to be determined via minimization of Wy{N,M) 
(or Wy{N, M)). To ease the optimization, we pre-select fio and /3j (a typical choice is 
fio = fii = /5/2). The constraints are as follows. Let C{N, M) := ciN -|- C 2 NM be the total 
computational cost, where ci is the cost for simulating one input parameter scenario, and C 2 
is the cost for simulating one response sample. Of course, there could be other minimization 
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criteria such as the overall computational complexity, and they can be minimized in a similar 
manner. Let CB be the total simulation budget. Consider the following Cl (half) width 
minimization problem 

min W(N,M) or min 

N,M N,M 

s.t C{N,M)<CB s.t 

N ^ Lg, M ^ Lg 
N,M eZ+ 


Wc{N, M) 

C{N, M) < CB 

iV > hg, M > Lg, (1 - a)NM > Lg 
N,M eZ+ 


Here the constraints iV>rg,M>rg and [1 — a) NM > Lg are imposed to ensure the 
validity of a f-statistics, and a typical choice for Lg is 30. 

Before solving problem (4.3), we still need to compute or estimate the “variance terms” 


o'v, Tv, Oc, and Tc in the objective function, since in practice they are usually unknown or un¬ 
available. A common £x is to run a pilot experiment with a small fraction of total simulation 
budget, and estimate the variance terms using the samples from the pilot experiment. Let 
us use 'ov, Tv, Uc and Tc to denote the estimates of cr^, Tv, <7c and Tc from the pilot experiment, 
respectively. They could be the obtained via sample averaging; however, this method might 
be very inaccurate since it involves rare-event simulation with few samples. For example, 
recall that 


= 


Var [{H{d)-Vo)^] 


(1 


a 


(1 


a 


E 




This indicates that estimation of is at least as difficult as estimation of Va- Using naive 
sample averaging to estimate cXc causes most of the samples to be ineffective, and thus 
results in an inaccurate estimate. In fact, theoretically only (1 — a) fraction of the samples 
will be effective; since a is close to 1, the percentage of effective samples is small. To be 
more specific, suppose a = 0.99 and N = 100 scenarios of H{9) are generated in the pilot 
experiment. Then theoretically only one scenario will be effective and used in the estimation, 
since the rest 99 scenarios result in a simple value of 0. 

One of the issues of the sample average method is that the information about the un¬ 
derlying distribution carried by the ineffective samples is not utilized. In contrast, a good 
estimation method usually makes use of the information carried by all the samples. For 
example, using (adaptive) importance sampling turns some of the ineffective samples into 
effective samples, and thus improves accuracy; however, this approach is not readily applica¬ 
ble here because we lack the knowledge about the p.d.f. of the mean response distribution. 

Next, we will propose a new approach to estimating the variance terms that exploits the 
information carried by all the samples generated in the pilot experiment. Recall that 


= a{l - a)/f{va), t^ = E[Tg\H{e) = u„], 

and 

= Var - <,„)+] /(I - a)^ = E[t||J/( 9) > t,„]. 

The challenges are two folds: (i) the lack of an explicit formula for /(■); (ii) the lack of a 
functional representation for t^(-), where r^(?/) := E[Tg\H{6) = y]. 

To address the hrst challenge, we apply a technique called “density projection”. That 
is, we project the discrete empirical distribution of H{6) onto a parameterized family of 
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continuous densities. Then the resulted projection, which is a continuous density, will be 
used as an approximation of f{-), and and Uc are computed via numerical integration. 
The detailed description of density projection is as follows. 

A projection mapping from a space of probability distributions V to another space con¬ 
sisting of a parameterized family of densities jF, denoted as Projjr : P —)■ is dehned 

by 

Proj^ig) = aigmin DKiig || /), e V, (4.4) 

where || /) denotes the Kullback-Leibler (KL) divergence between g and /, which is 


DKiig 


= j 9 {x) log ^^dx. 


fix) 


Here note that the densities g and / are assumed to have the same support. Hence, the 
projection of 5 ^ on has the minimum KL divergence from g among all densities in P. 
Loosely speaking, the projection of 5 ^ on is the best approximation of g one can hnd in P. 
When P is an exponential family of densities, which includes common families of densities 


such as Gaussian, the minimization problem (4.4) has an analytical solution. Note that this 


technique utilizes the information carried by all the samples. 


Remark 4.1. If i.i.d. samples of g are generated to compute Projjr{g), then the proposed 
density projection technigue is eguivalent to maximum likelihood estimation. Furthermore, if 
P is an exponential family of densities with sufficient statistics that consist of polynomials, 
then density projection is eguivalent to method of moments. 


To address the second challenge, we apply regression for r^(|/) onto the space of H{6), 
and use the samples from the pilot experiment to train the regression model. Simple numer¬ 
ical tests show that a polynomial regression with basis functions consisting of polynomials 
(degree< 3) of H{6) is sufficiently good. Then and rf are computed via numerical inte¬ 
gration. 

After plugging the approximate variance terms a^, %, Uc and p into problem (4.3), it 
remains to solve the minimization problem. Solving it analytically to optimality is unlikely 
because the objective function might not possess structural properties such as convexity. 
Alternatively, we can enumerate a reasonable amount of candidate allocation schemes (e.g., 
a two-dimensional grid of feasible allocation schemes), and choose the one scheme that yields 
the smallest Cl width. 

We also point out that it is benehcial to consider a more sophisticated budget allocation 
scheme in which the inner sample size varies across different input (parameter) scenarios. For 
example, in the estimation of Va, the input scenarios that heavily affect estimation accuracy 
are the ones with mean responses close to Vq. In particular, for a specihc input scenario, it 
affects estimation accuracy if the true mean response of that input scenario falls into one 
side of Va while its estimation falls into the other side. In this case, the inner sample size for 
this input scenario should be increased to reduce the probability of such event. This problem 
has been studied in the setting of nested credit risk assessment using ranking and selection 
(Broadie et al. (2011)) and screening (Lan et ah (2010)), etc. 
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5 


Numerical Experiments 


5.1 


Comparison between CIs under Weak Assumption and Strong 
Assumption 


We first use a simple numerical example from Gordy and Juneja (2010) to compare the 
Cl procedures under Weak Assumption and Strong Assumption (referred to as “weak Cl 
procedure” and “strong Cl procedure”), respectively. In particular, consider H{6]^) = 
A^(0,1) + W(0,1), a summation of two independent standard normal random variables. 


In Gordy and Juneja (2010), the first A/'(0,1) represents the (outer-layer) portfolio loss 
distribution and the second A/'(0,1) represents the (inner-layer) pricing error. Clearly, this 
example does not £t into our input uncertainty framework. The reason for using it is that the 
exact risk values, and all variance and bias parameters admit closed-form expressions. Thus, 
comparisons between weak Cl procedure and strong Cl procedure are precise. Performance 
measures of interest include Cl width and actual coverage probability, i.e., the probability 
that the true risk value falls into the simulated Cl. In particular, we will run the simulation 
1000 times independently and identically to compute the two performance measures, in 
which the optimal budget allocation scheme from minimizing Cl width is employed in weak 
Cl procedure and the optimal budget allocation scheme from minimizing MSE is employed 
in strong Cl procedure. The results for VaR (results for CVaR are similar, and thus omitted) 
are summarized in Table 15.11 


Table 5.1: Comparisons of Two Cl Procedures in VaR Estimation. 



Weak Cl Procedure 

Strong Cl Procedure 

C{N,M) 



Half Cl 
Width 

Coverage 

Probability 

Ns 

Ms 

Half Cl 
Width 

Coverage 

Probability 

10 ^ 

212 

47 

0.65 

100 % 

33 

311 

0.72 

94.7% 

10 ^ 

669 

149 

0.37 

100 % 

70 

1446 

0.50 

95.9% 

10 ^ 

2114 

473 

0.21 

100 % 

149 

6716 

0.34 

95.8% 

10 -^ 

6683 

1496 

0.12 

100 % 

321 

31173 

0.23 

95.8% 


The risk level of interest a = 0.95, the target confidence level (1 — /3) = 0.95, and the total simulation cost 
C{N, M) = NM + N. The pair (i\V, M^) is the optimal budget allocation obtained by minimizing the width 
of a Cl under Weak Assumption, while {Ns,Ms) is the optimal budget allocation obtained by minimizing 
MSE of Va under Strong Assumption. The coverage probabilities are obtained via 1000 independent and 
identical runs of simulation. The CIs under Weak Assumption do not take the bias into account while the 
CIs under Strong Assumption do. 


The numerical results show that: 1) The optimal budget allocation schemes for weak Cl 
procedure and strong Cl procedure could be drastically different. 2) In general, compared 
with strong Cl procedure, weak Cl procedure generates narrower CIs. 3) As expected, 
weak Cl procedure generates CIs with coverage probabilities (100%) greater than the target 
conhdence level (95%) while strong Cl procedure generates CIs with coverage probabilities 
equal to 95%. 

Overall, weak Cl procedure appears to be better than strong one in the sense that it 
generates narrower CIs with higher coverage probabilities; however, we should note that the 
budget allocation scheme for the strong Cl procedure aims at minimizing MSE instead of 
Cl width. 
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5.2 An M/M/1 Queue 

Let us consider another example for risk quantification under input uncertainty—an M/M/1 
queueing system from Zouaoui and Wilson (2003). In particular, we focus on estimating 


the risk of mean sojourn time due to input uncertainty. In the M/M/1 queueing system, 
assume the “true” Poisson customer arrival rate is Aq, which means the inter-arrival times 
between customers are independently sampled from an exponential distribution with rate Aq. 
Further assume the “true” exponential service rate is po, which means the service time for 
each customer is sampled from an exponential distribution with rate /Xo- Here “true” means 
that the values of Ao and po are known to us (the judges) but not known to the experimenter. 


We will mainly follow the experiment parameter set-up in Zouaoui and Wilson (2003), i.e., 
Po = 500 and Aq = 50, 250,450 — a range of values corresponding to increasing levels of true 
arrival intensity. To model input uncertainty, we take a Bayesian approach to construct 
the belief distribution on input parameters—the Poisson arrival rate A and the exponential 
service rate fi. Specihcally, assume non-informative priors for both A and /x, i.e., Po(A) cx 1/A 
and Po(/^) cc l//i. Based on n = 10,100,10000 historical observations of A and fi (drawn 
from the corresponding distributions with the true parameters), a Bayesian updating is 
applied to obtain the posterior distributions of A and /i. In particular, denote the historical 
observations of A by x = (xi,...,x„). Then the updating on the posterior distribution 
of A is carried out analytically and leads to p(A|x) = A""^ exp (—A Xj), which is a 
Gamma distribution with shape parameter n and scale parameter l/(X]r=i^*)- Similarly, 
let y = (x/i,..., Vn) be the historical observations of /i. Then the posterior distribution of fi is 
p(/i|y) = exp (—/i l/j)—a Gamma distribution with shape parameter n and scale 

parameter 1/(X]r=i Vi)- 

The objective is to estimate Va and Cq, (a = 0.90,0.95,0.99) of mean sojourn time w.r.t. 
the posterior parameter distributions p(A|x) and p{fi\y), and construct the associated 100(1 — 
f3)% GIs (/3 = 0.05). In particular, we draw N = 5000 input parameter scenarios from p(A|x) 
and p{fi\y) that satishes X < p (requirement of a stable queue). Furthermore, for each input 
parameter scenario, we draw M = 200 samples of sojourn times by simulating the queue’s 
first 200 sojourn cycles to estimate its mean sojourn time. Finally, Va and Ca of mean sojourn 


time are estimated via (2.3) and (2.4), respectively. As for the GI construction, weak GI 


procedure is used. The reason for not using strong GI procedure is that the bias components 
(see Lemma B.2 and B.3 in Appendix for explicit formulas) needed are very difficult to 
estimate accurately. In fact, our numerical tests show that the bias estimation brings new 
error that overwhelms the bias itself. The simulation results are summarized in Table 15.2 
and 15.21 

We have the following observations: 


(1) In general, there are significant gaps between the expectations of mean sojourn time 
(column 3) w.r.t. input uncertainty and VaR or GVaR of mean sojourn time (columns 
4 to 6) w.r.t. input uncertainty. It implies that risk quantihcation in stochastic simu¬ 
lation under input uncertainty is necessary. 


(2) As the size of input data increases, VaR or GVaR of mean sojourn time decreases, which 
indicates that the risk in simulation due to input uncertainty decreases. Intuitively, as 
more input data become available, the belief distribution on input parameter becomes 
more concentrated on the values close to the true one. Therefore, loosely speaking, the 
mean response distribution is also more concentrated on the values close to the true 
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Table 5.2: VaR (with 95% Cl) of Mean Sojourn Time in an M/M/1 Queue. 


Ao 

n 

Mean =f 

Half Cl Width 

VaRa^ =F 

Half Cl Width 

VaRa2 T 

Half Cl Width 

VaRa3 T 

Half Cl Width 

50 

10 

2.4 X 10“^ 

3.4 X lO-*^ 

T 

3.7 X 10-^ 
7.3 X 10-^ 

T 

4.5 X 10-3 
1.3 X 10-3 

T 

6.5 X 10-3 =p 

1.7 X 10-3 

50 

100 

2.2 X 10“^ 
9.7 X 10“® 

T 

2.6 X 10“^ 
5.1 X 10“^ 

T 

2.8 X 10-3 

4.8 X 10-^ 

T 

3.1 X 10-3 ^ 

6.1 X 10-^ 

50 

10000 

2.2 X 10“^ 
6.9 X 10“® 

T 

2.4 X 10“^ 
5.6 X 10“^ 

T 

2.5 X 10-3 

4.6 X 10-^ 

T 

2.7 X 10-3 ^ 

5.6 X 10-^ 

250 

10 

5.2 X 10“^ 
2.1 X 10“^ 

T 

9.7 X 10“^ 
4.4 X 10-3 

T 

1.6 X 10-2 
9.5 X 10-3 

T 

4.3 X 10-2 =F 

3.1 X 10-2 

250 

100 

4.2 X 10“^ 
4.1 X lO-*^ 

T 

5.7 X 10-3 
1.6 X 10-3 

T 

6.5 X 10-3 
2.2 X 10-3 

T 

8.7 X 10-3 ^ 

4.3 X 10-3 

250 

10000 

3.9 X 10“^ 
1.8 X lO-*^ 

T 

4.5 X 10-3 
1.1 X 10-3 

T 

4.7 X 10-3 

1.7 X 10-3 

T 

5.1 X 10-3 =p 

1.4 X 10-3 

450 

10 

9.9 X 10“^ 
3.3 X 10“^ 

T 

2.4 X 10-^ 
1.6 X 10-2 

T 

3.4 X 10-2 
2.7 X 10-2 

T 

5.5 X 10-2 =F 

4.1 X 10-2 

450 

100 

1.8 X 10“^ 
3.6 X 10-^ 

T 

3.5 X 10-2 

2.6 X 10-2 

T 

4.2 X 10-2 
2.8 X 10-2 

T 

5.3 X 10-2 ^ 

3.7 X 10-2 

450 

10000 

2.1 X 10“^ 
2.6 X 10-^ 

T 

3.0 X 10-2 
2.4 X 10-2 

T 

3.4 X 10-2 

2.5 X 10-2 

T 

4.1 X 10-2 =F 

2.7 X 10-2 


The experiment parameters are: po = 500, N = 5000, M = 200, ai = 0.90, = 0.95, and 

0:3 = 0.99. 

mean response, and essentially reduce the risk of large mean sojourn time. 

(3) As Ao increases (arrival traffic intensihes) and approaches the service rate po, the 
system becomes less stable and the risk in simulation due to input uncertainty is more 
signihcant. Therefore, more input data is required to reduce such risk to an acceptable 

level. 

We further study the associated budget allocation problem. Note that for VaR estimation 
and CVaR estimation, the budget allocation problem might yield different optimal allocation 
schemes. Let C(V, M) = NM + N and CB = 5 x 10^. We use NpHot = 50 outer scenarios 
and Mpiiot = 100 inner samples for each scenario in the pilot experiment to guide the budget 
allocation in the actual experiment. In total, only 1% percent of total budget is consumed, 
and the budget for the actual experiment is barely affected. To exhibit the effectiveness of 
the pilot experiment, we plot the Cl widths for different choices of N in Figure [T| where 
the blue curves are the Cl widths calculated using variance terms estimated from the pilot 
experiment, and the red curves are the Cl widths calculated using the true variance values 
obtained by simulation-to-death (i.e., using extremely large sample sizes). 

We make the following observations: 

(1) In both plots, although there is a non-negligible gap between the Cl width (blue curve) 
computed using the variance terms estimated from the pilot experiment and the true 
Cl width (red curve), the curves follow the same trend and their minima coincide. This 
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Table 5.3: CVaR (with 95% Cl) of Mean Sojourn Time in an M/M/1 Queue. 


Ao 

n 

Mean =f 

Half GI Width 

CVaRai T 

Half GI Width 

CVaRa2 T 

Half GI Width 

CVaRa^ T 

Half GI Width 

50 

10 

2.4 X 10“^ 

3.4 X lO-*^ 

T 

5.0 X 10“^ 
2.8 X 10“^ 

T 

6.0 X 10-5 
4.7 X 10-^ 

T 

9.0 X 10-5 ^ 

1.6 X 10-5 

50 

100 

2.2 X 10“^ 
9.7 X 10“® 

T 

2.8 X 10“^ 

4.9 X 10-5 

T 

2.9 X 10-5 

6.9 X 10-5 

T 

3.2 X 10-5 ^ 

1.6 X 10-^ 

50 

10000 

2.2 X 10“^ 
6.9 X 10“® 

T 

2.6 X 10“^ 
3.3 X 10-5 

T 

2.6 X 10-5 

4.7 X 10-5 

T 

2.8 X 10-5 ^ 

9.8 X 10-5 

250 

10 

5.2 X 10“^ 
2.1 X 10“^ 

T 

2.1 X 10“^ 
2.4 X 10-5 

T 

3.1 X 10-^ 

4.2 X 10-5 

T 

5.3 X 10-^ T 

9.6 X 10-5 

250 

100 

4.2 X 10“^ 
4.1 X lO-*^ 

T 

7.0 X 10-5 
3.3 X 10-^ 

T 

7.8 X 10-5 
5.6 X 10-^ 

T 

1.0 X 10-^ =F 

2.0 X 10-5 

250 

10000 

3.9 X 10“^ 
1.8 X lO-*^ 

T 

4.8 X 10-5 
9.7 X 10-5 

T 

4.9 X 10-5 
1.4 X 10-^ 

T 

5.3 X 10-5 ^ 

3.2 X 10-^ 

450 

10 

9.9 X 10“^ 
3.3 X 10“^ 

T 

3.8 X 10-^ 
3.0 X 10-5 

T 

4.7 X 10-^ 
4.6 X 10-5 

T 

6.8 X 10-^ T 

1.1 X 10-2 

450 

100 

1.8 X 10“^ 
3.6 X 10-^ 

T 

4.3 X 10-^ 

2.4 X 10-5 

T 

4.9 X 10-^ 
3.3 X 10-5 

T 

5.8 X 10-2 ^ 

7.8 X 10-5 

450 

10000 

2.1 X 10“^ 
2.6 X 10-^ 

T 

3.5 X 10-^ 
1.7 X 10-5 

T 

3.8 X 10-^ 
2.4 X 10-5 

T 

4.4 X 10-2 =F 

5.7 X 10-5 


The experiment parameters are: po = 500, N = 5000, M = 200, ai = 0.90, = 0.95, and 

0:3 = 0.99. 


implies that solving the formulated budget allocation problem could identify the opti¬ 
mal budget allocation scheme. In light of the fact that only 1% of the total simulation 
budget is used, we could claim that our budget allocation problem and its solution 
strategy provide effective guidance in determining good budget allocation schemes. 

(2) By comparing the difference between maximum and minimum of the red curve (either 
one), we can see that using an optimal budget allocation scheme could narrow a Cl by 
3 to 4 times. When the total simulation budget is limited, solving the budget allocation 
problem is very beneficial. 


(3) The best budget allocation schemes for VaR estimation and CVaR estimation are 
drastically different. In particular, the optimal iV for constructing Cl of VaR is around 
10^ while the optimal V constructing Cl of CVaR is around 10^. This is different from 
the result by Gordy and Juneja (2010), in which the optimal {N,M) for minimizing 
MSEs of nested VaR and CVaR estimators are both N = 0{M‘^). This is because 
the budget allocation problems in our framework and Gordy and Juneja (2010) have 
different objective functions, and thus lead to different optimal solutions. 


(4) Another phenomenon worth mentioning is that the GI width for GVaR estimation 
appears to be decreasing in N (see right half of Figure [^. This is due to the structure 
of the objective function (4.2). It is easy to see that, as N increases, the hrst term 
in (4.2) decreases but the second term remains almost unchanged since NM » N. 
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Figure 1: VaR and CVaR Cl Widths: Pilot Run V.S. Actual Run 
The experiment parameters are: Ao = 150, fio = 500, a = 0.95, and the size of input data n = 10. 

Therefore, the optimal solution is to increase N or decrease M as much as possible, 
until M hits the low bound Pq. 

In conclusion, the simulation results for the M/M/1 queueing system provide empirical 
evidences for the importance and necessity of risk quantihcation in stochastic simulation 
under input uncertainty, as well as the advantages of solving the associated budget allocation 
problem for efficient nested simulation. 

6 Conclusion 

In the present paper, we introduce risk quantihcation in stochastic simulation under input 
certainty, which rigorously quantihes extreme scenarios of mean response in all possible 
input models. In particular, we propose nested Monte Carlo simulation to estimate VaR 
or CVaR of mean response w.r.t. input uncertainty. We prove the asymptotical properties 
(consistency and normality) of the resulted nested risk estimators in different limiting senses 
under different sets of regularity conditions. We further use the established properties to 
construct (asymptotically valid) CIs, and propose a practical framework of optimal budget 
allocation for improving the efficiency of nested risk simulation. The work in this paper can 
be viewed as a starting point of research on more general risk measures for risk quantihcation 
under input uncertainty. 

On the other hand, the naive nested risk estimators considered here could be restrictive 
in risk quantihcation under input uncertainty for large-scale systems, due to the inefficiency 
of naive rare-event simulation. The budget allocation problem solved in this paper partially 
addresses this issue in the sense that it leads to good outer versus inner sample size tradeoh 
in reducing Cl width. Developing more sophisticated budget allocation schemes will be a 
promising direction of future research. 
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A Proof of Theorem 3.1 


For simplicity, let us use , c^, , and to denote Va, Ca, Va, and Ca, respectively. 

Therefore, we need to show that 

lim lim = Va, w.p.l, and lim lim = Cq, w.p.l. 

N^oo M^oo N^oo M^oo 

In view of the error decomposition 

- v^) + {v^ - V.) and - c^) + (c^ - c.) , 

it is sufficient to show that 


N^OO 


lim ( 


M^OO 


^N,M _ 


Va) = 0, W.p.l. and lim (cY — Ca) 

' N^oo ^ ' 

= 0, 

w.p.l. 

(A.l) 

.., 9n, 




) = 0, w.p.l. and lim (c^’^ — c^) 

' M-s-oo ^ ' 

= 0, 

w.p.l. 

(A.2) 


To establish (A.l), we need the following lemma, and its proof can be found in online 
appendix. 


Lemma A.l. Under Assumption 3.1\ (ii), 

1 


= 


f{Vc 


« - ^ X] < Uo} j + An, 


N 


N 




i=l 


Va + 


2=1 

1 

1 — a 


- VaY 


— Co I + Bn, 


(A.3) 

(A.4) 


where An = Oa.s.{N = Oa.sXN ^logA). Here note that the statement 

g{N) = Oa.sX^i^)) vfieans that g{N) < C ■ h{N) almost surely for some constant C. 


Proof. The asymptotical representation (A.3) is exactly Theorem 2.5.1 in Serffing (2009) 


under Assumption 3.1 (ii). The asymptotical representation (A.4) is the special case of 
Theorem 2 in Sun and Hong (2010), when the importance sampling measure C = 1. □ 


N 


Notice that ^ ^ Va} is an unbiased sample estimator of a. By Strong Law 

2=1 

of Large Numbers, 

1 ^ 

w.p.l. 

N^OO I\ ^ 

2 = 1 

Combining with the fact lim A^v = 0, w.p.l, lim — Uo) =0, w.p.l. To show the latter 

N^OO N^OO ^ 

N 

half of (A.l), notice that X] {H{9i) — Uq,)’''] is an unbiased sample estimator of 


2=1 


Ca- Furthermore, by Assumption 3.1 (i). 


E[H\9)] =E[E2[h(0;OI^]] = / EYhi9-,^)\9]f{9)d9 < / E[h^{9;f)\9]f{9)d9 < oo. 
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Therefore, Var{H{9)) is finite and Var{va + {H{9) — t’o,)''') is also finite. By Strong Law 

of Large Numbers, 


1 ^ 

lim — 

7V-5-00 N ^ 
2=1 


Va + 


OL 


{i^{9^ 



0 , w.p.l. 


Combining with the fact lim Bjq = 0, ta.p.l, lim (c^ — Ca) = 0, w.p.l. (A.l) has been 
established. 

It remains to establish (A.2) for hxed N and scenarios 9i,...,9]s[. That is, we need to 
show for fixed N and scenarios 9i, ...,9 n, 

hm - H{9^^^m)) = 0, w.p.l, 


M^oo 


lim 


N 


N 


M^oo \ (1 - a)N 

\ '■ ' -J —, 


E E m,) 


i=aN 


(A.5) 

= 0, w.p.l. (A.6) 


i=aN 


Recall that for any 9i,i = 1,...,N, ¥\h{9i,^)\9j\ = H{9i) and Var[h{9i,^)\9j\ = rf < oo, 
where we use rf to denote Tq_ with slight abuse of notations. By Strong Law of Large 

Numbers, we have for i = 1,...A^, HM{9i) H{9i), w.p.l. Let hlj C be the set of 

such convergent scenarios for i = 1,...,N, where hi is the underlying sample space. Thus 
P{VLi) = 1. Denote Vt ;= Hili intersection of all convergent scenario sets. Clearly, 

by Boole’s Inequality PiS^) = 1. Let us also denote, for any scenario w G 0, H^{9) as the 
sample realization of Hm{ 9), i = 1, ..,N. Therefore, Ww G D 


hm iHZi9,),...,H^i9^)) = iHi9,),...,Hi9^)). 

M^oo 


(A.7) 


Let e := | min{iL(6*i) — H{9j) : i ^ j, i,j = 1,...,A^}. By dehnition, (A.7) implies that 
there exists a sufficient large such that VM > M^, \H^{9i) — H{9i)\ < e,i = 1, ...,N. It 

follows that, VM > M^, 




That is, VM > M^, the sampling error so small that the order sequence of the mean response 
is not perturbed. Thus, VM > M^, .., = (0(1),..., 6'(jv)), where 9w^ is the sample 

realization of 0^*^ with scenario w. Therefore, for any scenario tc G 0, 

lim %,(«£■"’) = lim = ^^(«(.«)). 

M^oo M—i-oo 


and 


N 


N 


N 


lim -r , 

M^oo (1 - a)N ^ 

^ ' i=aN 


E = ,lim 


M^oo (1 - a)A^ ^ 

^ ' i=aN 




(1 - a)N „ 

' ' i=olN 


E "('»(•))■ 


Notice T’(D) = 1, (A.5) and (A.6) naturally hold. 
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B Proof of Theorem 13.2 


Recall we need to show that 


lim = V a, w.p.l, and lim = Ca, w.p.l. 


N,M—>‘O0 


N.M^OO 


In addition to the notations previously introduced in Appendix let us further use and 
to denote Va{HM{0)) and Ca{HM{0)), respectively. That is, and are the exact 
a-level V aR a nd CVaR of noised mean response Hm{0), respectively. As mentioned after 
in view of the fact that Va{H{6)) = Va{HM{d)) and Ca{.H{6)) = 


3.2 


Theorem _ 

and could be regarded as the one-layer Monte Carlo estimator of and , 
respectively. This observation inspires us to consider the following error decomposition 

- C) + (C - -.) and - c^) (cf - c„) . (B.l) 

Therefore, it is sufficient to show that 


lim = Va and lim c“ = c^. 


M 


M^oo 


M^oo 


and uniformly for all M, 


lim w.p.l and lim w.p.l. 


N^OO 


N^OO 


(B.2) 


(B.3) 


Let us hrst establish (B.2). The following lemmas will be useful, and we refer to online 
appendix for the proofs. 


Lemma B.l. Under Assumption 3.2, if a sequence Im —)■ t as M —)■ cx), then fM{tM) —t f{t) 


and f'{t) as M ^ oo, where recall fM{') is the p.d.f. of noised mean response 

HMiO). 


Proof. This result is exactly Lemma 1 in Gordy and Juneja (2010). For convenience, we will 


briefly present the proof. Recall that Hm{0) = H{6) + Sm/VM, where {H{6),Sm) has a 
joint distribution Pm( h, e). Therefore, 

InitM) = / PM{tM - e/\fM,e)de and f{t)= / pM{t,e)de. 


It follows that 


fnitn) - f{t) = j (pM{t - e/ Vm, e) - pm(C e)) de. 


By Taylor series expansion, this equals 




d ~ 

e^PM (tju, ejde. 


where Im lives in between tM and t. By Assumption 1 and the fact that ^m —^ ^ as M —)■ cx), 
both terms converge to zero as M —)■ cx). □ 
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Lemma B.2. Under Assumption 3.2 


wM _ I iVa) ^ 


where the function A{t) = l/2f{t)'E[Tg\H{9) = t] and om{-^) means this quantity goes to 
zero faster than A (almost surely). 


Proof. This result is very similar to Proposition 1 in Gordy and Juneja (2010). The proof 
here will mainly follow Gordy and Juneja (2010)’s proof. 

Recall that Tm(-) is the c.d.f. of the noised mean response Hm{9), and is the exact 
a-level VaR of Hm{0). Thus, Fm{v^) = «. By Taylor expansion, we have 


a = Fm{v^) = Fm{vo) + {v^ - VoOfniva) + 
where lives in between and Va- Therefore, 

a - FM{Va) = {v^ - Va)fM{Va) + 
Furthermore, notice that 


— n, 




n JMK^a J) 


(B.4) 



and 


FuiVa) = / fM{t)dt = 


a = F{va) = / f{t)dt = 


Vot—ej y/M 


PM(t, e)dtde, 



Gombining (B.5) and (B.6), we have 

a - Fuiva) = 

By Taylor expansion, we have 



Va—elpM 


Pnit, e)dtde. 


PMit, e)dtde. 


(B.5) 


(B.6) 


(B.7) 


d {t — 

PM{t,e) = PM{Va,e) + {t- Va)-^PM{Vo„e) + ^ 


where Va lives in between Va and t. Hence, 


a - Fm{vo) = 





Puiva, e)dtde + 


d 

{t - VaO-^PM{va,e)dtde 

R Jva-e/UM 



+ 



Va—elpM 


2 dP 


PAii^a, e)dtde. 


(B.8) 


The first term of the right hand side of (B.8) satisfies 



Va — e/y/M 


PM{.Va, e)dtde = 




-^PM{va,e)de = -j=E[SM\H{e) = n„] = 0. 
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The second term of (B. 8 ) satisfies 



, ,d 1 /■ 2 ^ 

[t - Va)—pM[va,e)dtde = / e — 

Va-elVM ZIVl ./b at 


e^—pM{va,e)de 


1 d 

mwt 

1 d 

mwt 

1 


/ e^pM{va,e)de 

Jm. 

f{v^)E[T^\H{9)=v, 




By Assumption 3.2, the third term of (B. 8 ) is in the order of Om{M 2 ). Therefore, 

a - Fm^Vo) = 


(B.9) 


Combining (B.9) with (B.4), we have 




-Va)fM{Va) + 




where note that by Assumption 


3.2 


t and M. Combining with Lemma B.l Lemma B.2 holds. 


it is easy to see that f^if) is uniformly bounded for all 

□ 


Lemma B.3. Under Assumption 3^ 




(B.IO) 


Proof. The result here is very similar to Proposition 3 in Gordy and Juneja (2010), and 
our proof will mainly follow Gordy and Juneja (2010 )’s proof. Note that by Mean Value 
Theorem, 


c" = 

1 -a 


E 


1 


1 — a 


tfM{t)dt + 


1 — a 


-E 


1 — a j 

X ^ ^OL 


1 — a 
tfM{t)dt 
1 

+ 


tfM{t)dt 


Va - Va%fM{tv), 


1 — a 


where lives in between Va and . By Lemma B.2| we know 

1 


/ £ (j. \ '^aA. {Va) ^ 


Therefore, 




1 — a 


-E 




, Vo,A\Vof) , ^ 1 ^ 
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Further notice that 
1 


1 — a 


Cr\ — 


-E 


1 — a 


HM{e) ■ > uj 


E[Hie)-i{H{e)>v^}] = 


1 — a 
1 

1 — a 



Va—el^/M 


{t + e/ '/M)pM{t, e)dtde, 



e)dtde, 


and 


Therefore, 



epM{t,e)dtde = / E[SM\Hid) = t]f{t)dt = 0. 


c^-c = 

“ “ 1-a 



tpM{t, e)dtde H— 

'y N 


' Va—e/y/M 


Pui^i e)dtde 


n„A'(n„) 1 

(1 -a)M 


(B.ll) 


Similar to the derivation (by taking Taylor expansion) from (B.7) to (B.9), we have 
1 


1 — a 



tpM{t,e)dtde = - -vdT “ Ti-777 + *^^ ^ ' ’ 

vo^-eiVM 1 - a)M 1 - a)M 


and 


l-ay/N 


[ puit, e)dtde = 2 

Ivc-e/VM (1 - (^)M 


+ Om{M 2 ). 


Combining (B.ll), (B.12), and (B.13), (B.IO) holds and Lemma B.3 is proven. 


(B.12) 

(B.13) 

□ 


Lemma B. 4. Under Assumption \3. 2 

1 


N 




/(t 


M'\ 




i=l 


N r 


p^"-a = (^E 


2=1 


+ 


1 — a 


HM{0i)-v, 


M 


-c +Oa.s.(A^-MogiV), (B.15) 


where Oa.s.{N ^/^(log A^)^/'^) andOa.s.{N MogiV) hold uniformly for all M. 


Proof. Let us hrst establish (B.14). For simplicity, let us use G{-) and Gm{-) to denote the 
inverse functions of F(-) and Fm{-), respectively. Furthermore, denote U{9) = Fm{Hm{9))- 
Clearly, Hm{0) = Gm{U{6)) and = Gm{,ol). It is easy to see that U{6) is uniformly 
distributed over [0,1]. Moreover, from the relationship Hm{6{i)) < ■ ■ ■ < Hm{9(n))) we 
know that U{6{i)) < ■ ■ ■U{9 (^n')) is the corresponding order statistics of N i.i.d. uniformly 
distributed random variables. Furthermore, let us use Fff{-) to denote the sample c.d.f. 
induced by U{9i), ...,U{9 n). That is 


N 




2 = 1 
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By Lemma A.l, we know that 


N 


U{e^^N))-a = <«} ) +Oa.s{N-^^\^ogNf/^). (B.16) 


2 = 1 


Furthermore, by Taylor expansion, 

^N,M ^ HM{d{aN)) = GM{U{6(^aN))) 

= Gm{(^) + {U{9(^aN)) — ot)G']^{a) + - '^—G"j^{u) 


= c + 




{U {9(^aN)) — a) + 


/m(Gm(m))\ iU{9(^aN)) - OiY 


fliiGMiu))^ 


where u lives in between U{9(aN)) and a, and we use the facts that Gm{,oi) = G'j^{a) = 

^/Jm{v^), and G%{u) = (Gm(m))- Therefore, 


~N,M _ y M ^ 


fMiv. 


-{U {9(^aN)) — cn) + 


/m(G'm(m))\ {U{9(^aN)) - a)" 


fLiGMiu)) 


(B.17) 


On the other hand, by Lemma 2.5.4B in Serfling (2009), we have for sufficiently large N 


\U{9^aN))-a\<2N-^ (logiV)^ 


Combining with (B.16) and (B.17), we have 

N 


^n,m _^m ^ 

r\i r\i 


fuiv. 


iWi \ iV 


< «} -« +Oa.s.(iV-3/7logiV)3/7 


2=1 


N 


fMiv, 


M\ \ N 


<«}-«+ ~ Oa.sXN-^^'i^ogNf^) 


2=1 

N 


fMiv, 


M\ 

a 


fMiv, 


M\ \ N 


Y,HHMi9^)<v^^}-a] +.^-—0„.,.(iV-3/4(logiV)3/4). (B.18) 


2=1 


fMiv^) 


Notice that fMiv^) is strictly positive and fiviiv^) /(uq.) as M — )■ oo. Therefore, 
supl//M(h^) < oo. Thus, (|B.14) holds. 

M 

It remains to show (B.15). Notice that by dehnition 

, N 


~N,M _y.M ^ 


= 5"’" + 


7 -. -^ > 5"'“} - 

(1 -ajiV ^ 


N 


(l-a)iV^ 




V 


■N,M 


N r 


N 


E 

2=1 


v^ + ^- 
“ 1 - a 


HMi9^) - V 


xM 


-M 
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1 ^ r + 

+ - »*') + 7^^:^ Z - ("«(»<) -»") 

^ ' 2=1 L 


N r 


iV 


E 

i=l 


“ 1 - a 




-M 


where 


W:= + 


1 ^ 

(1 - a)/V E |_ 


^?M(9i) - 5 


'NM 


- [Hum-v, 


M 


We only need to show that (*) is in the order of Oa.s.{^ ^ logiV) uniformly for all M. Note 
that the second term in (*) 


N r 


^ ' 2=1 ^ 

1 ^ 

,jy E [{«M(0i) - ’") l{ffM(9.) > 5"'"} 

^ ' i=l 

- (Hm^ - hf) l{HMm > v^} 

N 


(l-a)Af^ 
1 


N 


J2 [HMm - C) > V^’^} - t{HMiO,) > €} 


{l-a)N 

1 


1 1 ^ 

(»f - s:-") + 7^^:^E -«") !{"*'(«<) < s^'"} 


AT 


(l-a)iV^ 
' ' 1=1 

1 


E 


{l-a)N ^ 

^ ' 1=1 

[t{HMm < hf} - i{HMm < 

N 


{l-a)N 


^ ^ i=l 


where 


N 


* * * = 


E 


7M 


(l-a)iV^ 

' ' 1=1 

Further note that 

(^NM _ .M) ^ 


(l-a)iV 


t{HM{ei) < h“} - l{HMm < 




(l-a)iV^ 


N 

E [h"’" - »") 1{5 m(9.) < 5"’"} 
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(1-a) 

1 

(1-a) 

1 

(1-a) 




N 


(e 


M ^ 


N 

1 

N 




i=l 

N 


J2HU{9,)<U{9(^o.N))}-a 


2 = 1 


(-«,« _ -M) (F»(t;( 9 ,„„))) -a)= («) 


Note that (*) = (**) + (***), we only need to show that (**) and (* * *) both are in the 
order of Oa.s.{^~^ logiV) nniformly for all M. 

By Lemma 2.5.4B in|Serfling 


this is nniform for all M, as in (B.18)) 


(2009), we know that for snfficiently large N (can be verihed 
iV-5 (logiV)^ . 


yiV.M_ .M| < 




(B.19) 


Moreover, by applying Theorem 2.5.1 and Lemma 2.5.4B in Serfling (2009) on U{9), we have 
for snfficiently large N 


\F^{a) - a| = 2iV-i (logiV)5 + Oa.sXN~^^\^ogNf/^). 


(B.20) 


Applying Lemma 2.5.4B and Lemma 2.5.4E (with cq = 2, g = 1/2) in Serfling (2009) on 
U{9), we have for snfficiently large N 


\FuiUi9iaN))) - («)| = 2iV-i (logiV)^ + 0,.,,(iV-3/4(logiV)3/4). 

Combining ( |B.20 ) and ( B.21[ ), we have for snfficiently large N 

IAII'WoA-))) - a| < 4iV-5 (logiVji + 0^,,,{N-^/*{logNf*). 


(B.21) 


Combining with (B.19), we have for snfficiently large N (nniform for all M) 

{N-^ (logiV) + Oa.s.{N-^^\^ogNf/^)) 




fM{v, 


M) 
a J 


In view of the fact that snpl//M(h^) < oo, we have (**) in the order of Oa.sX^ ^ logiV) 


M 


nniformly for all M. What is left is show (* * *) is also in the order of Oa.s.{F^ ^ logiV) 
nniformly for all M. 


(***)! = 


N 


< 


(l-a)iV^ 
1 


[HM{9i 




t{HMm < v^} - t{HMm < 


(1 - a) 
1 

(1 - a) 
1 

(1 - a) 


~N,M 



- w 1 

~N,M 



- W 1 

~N,M 


V 

\ 


1 ^ ^ 1 ^ ^ 

^ < «"} --Y, < 5"'“} 


N 


i=l 

N 


i=l 


N 




N 


2=1 


2=1 


F:{U{9^^^)))-FXX{a) 
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By (B.19) and (B.21), we have for sufficiently large N (uniform for all M) 


(**) ^ 


(1-a) 

4 

Mv^) 


.Ml 




N / 


{N-^ (logiV) + Oa.s.{N-^'\^ogNfl^)) . 


Again, in view of the fact that sup 1//M('ya) < oo, we have (***) in the order of Oa,s,(A^ ^logA^) 

M 

uniformly for all M. □ 


(B.3). 


By Lemma B.2 and Lemma B.3 (B.2) naturally holds. Furthermore, Lemma [B.4| implies 


C Proof of Theorem 3.3 


Following the notations introduced in Appendix and we need to show 

lim a/ZV ( v ^ — Va) ^ cr„A/'(0,1) and lim a/ZV (c^ — c^) ^ cTcA/'(0, 1), 

N^OO ^ ^ ' 


N^OO 


(C.l) 


where 


cr. 


= y/a(l - a)/f{va) and cxc = ^Jvar [{H{9) - n„)'^]/(l - a). 


Furthermore, 


lim lim a/M (v^’^ — v^) ^ TyAf{0, 1), 

M^OO ' 

lim lim -^(1 — a)NM (c^’^ — c^) ^ TcJ\f(0, 1), 


(C.2) 


where 


Ty^E[T^\H{e) = Va] and Tc = 


Let us first establish (C.l). This is a direct result of Lemma [A. 1 , where note that the order 
of Ajsf and are strictly smaller than Furthermore, H{6iys are i.i.d. random 

variables, and thus 


< = NVar 


N 


N 


f{Vo 


a 


N 


PMN 


Var [l{H{e) <n„}] = 


«(! — a) 


2=1 


P{Va) ’ 


and 


= NVar 


N r 


N 


E 

2=1 


Vq. + 


1 — a 


{H{dy - VoP 


- Cn 


(1-a) 


-,Var [{H{d)-Vo)-^] 


Next, let us establish (C.2). For hxed N and scenarios ^i, ...,6n, we have 





































and 


N 


N 




{l-a)N 




i=(yN 


(l-a)N , 




i=aN 


On the other hand, by Central Limit Theorem nnder Assnmption 3.1 (ii), 

Vm {Hm^Oi^o^n)) - h^(6'(aiv))) ^ T(«7V)-A/'(0, 1), 
where stands for = Var[h{ei)\d{^ n)]- Note that 

- VM (#M(^(aW)) - i^(0(aiV))) = VM " ^m(0(.7V)) 


(C.3) 


By nsing the same techniqne in proving (A.5) and (A. 6 ), we will show that 


hm - ^M( 0 (a 7 V))) = 0 . W.p.l. 


M^oo 


(C.4) 


Indeed, denote the nnderlying sample space by O. In Lemma [A. 1[ we have established that 
Mw G fl, there exists an snch that VM > M^, ..^9^'^) = ( 6 ^( 1 ), ...,6'(Ar)), where 9w^ is 

the sample realization of 0 ^) with scenario w. It follows that VM > M^, 

'/M ('//£(»£■">) - /?„(»(„»))) = 0. 


Thns, (C.4) holds. Combining with (C.3), we have 


Tm r(„^)Ar(0,1). 


Fnrthermore, similar to showing lim = Va^w.p.l^ we can show that 

N^oo 


Thns, 


lim = JE[T^\H{e) = n„] = r^. 

N^oo ^ ’ V 

lim hm VM [Em{9^^^^) - 4 r,A^( 0 , 1 ), 


N^oo M^oo 


and the hrst half of (C.2) holds. It remains to establish the second half of (C.2). Note that 


by Central Limit Theorem, 

/ W AT 

lim_ ^(l-a)NM ( „ , ^ Y1 ^ !)• 


M^oo 


(1 -a)Ar 


i=aN 


i=aN 


(C.5) 

where n := ■y/EEjv / [(1 — a)A^] and is short for Following a similar argnment 
in showing (C.4), we can show 


hm y/(l — a)NM 


N 


N 


M^oo 


(1 - a)N 

^ ^ i=aN 




(1 - a)N 

' ' i=aN 


E HMiOii)) = 0 , w.p.l. 
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Combining with (C.5), we have 


N N 

5: #„(«(.>) - 5: H(e,,) 1 S rW(0.1). 

^ ^ i=0'N ^ 


i=aN 


Notice that 


N 




lim rj = lim ^ = ¥.[Tl\H{e) > n„] = rl 

N^oo N^oo (1 — CX )I\ 

i=aN ^ ' 


The latter half of (C.2) holds. 


D Proof of Theorem 3.4 


Follow the notations in Appendix and we need to show that under Assumption 3.2 
A^ = is a sufficient and necessary condition for 


lim - Vo) ^ cr„A/'(0,1), 

N,M^oo ^ ’ 

lim \fN (c^’^ - Co) ^ cTcA/'(0, 1). 


N.M^OO 


(D.l) 

(D.2) 


By Lemma |B.2| , |B.3| and |B.4[ we have 

1 


N 


- V. 


0 = 


- C. 


/(h 


M\ 
a } 




2=1 


^ + o«(i) + o.uiv-/pogiv)3/y 


) = L 


N 

E 

2 = 1 

A(Ua) 


v^ + ^- 
“ 1 - a 


Huie) - V, 


-M 




Note that —)■ Va, and f{v^) —)■ f{va) as M — )■ cxd. We have 


lim 

M—>-00 j^^yM 


^a-l{HM{0) <v))}) =<Va}), w.p.l, 


/(w) 


and 


lim ( + - - 

M—>-oo \ 1 — a 


Hm{0)-v, 


M 


- c:f I = I + 


1 — a 


{H{9) -Vo)^ - Co] , w.p.l 


Therefore, by Central Limit Theorem, (D.l) and (D.2) hold if and only if A^ = om{M^). 
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E Proof of Theorem 3.5 


To establish part (i) of Theorem 3.5, by Boole’s Inequality, it is sufficient to show the 
following limits 


lim lim P{|T^rri| < 

lim lim P{|Prr 2 | < 21 = x —/3o, 

vT^-ooM^oo vAf 

lim lim PUErr^l < = I — 

^^00 M—>-oo Y {l—cx)NM 


lim _hm P{|Prr 4 | < = 1-^0- 


\ N^OO M^OO 


where recall that Erri - Err^ are dehned in (3.3) and (3.4). In view of the fact that a 


Student’s t-distribution converges to a standard normal distribution as the degree of free¬ 
dom goes to inhnity, the almost sure convergence of variance estimators by Strong Law of 
Large Numbers, and the consistency of kernel density estimation, these limits naturally hold 


following Theorem 3.3 Similarly, part (P) of Theorem 3.5 can be established. 
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